A Grammar-based Approach for Compressing XML
Abstract
XML is a popular meta-language in widespread use across a variety of application domains. However, its verbosity has limited its acceptance in cases where a more succinct textual or binary data encoding format can be used. In this report, we describe AXECHOP, an XML-conscious compressor that uses a grammar-based approach to exploit the potentially significant structural redundancy within XML documents and so achieve significant rates of compression.
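To make the idea of grammar-based structural compression concrete, here is a minimal sketch. It is not AXECHOP's algorithm, which the abstract does not specify; it swaps in a simple Re-Pair-style digram substitution purely for illustration. The function names (`tag_tokens`, `repair`, `expand`) and the `#n` rule-naming scheme are hypothetical choices made for this example.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def tag_tokens(xml_text):
    """Flatten an XML document into its sequence of start/end tag tokens.

    Text content and attributes are ignored: only structure is kept.
    """
    tokens = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        tokens.append(elem.tag if event == "start" else "/" + elem.tag)
    return tokens


def repair(tokens):
    """Re-Pair-style sketch: repeatedly replace the most frequent adjacent
    pair of symbols with a fresh nonterminal until no pair occurs twice.

    Returns (compressed_sequence, rules), where rules maps each
    nonterminal to the pair of symbols it stands for.
    """
    seq = list(tokens)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        # '#' cannot start an XML name, so rule names never clash with tags.
        name = "#" + str(len(rules))
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert repair(): recursively expand nonterminals back into tags."""
    out = []
    for sym in seq:
        if sym in rules:
            out.extend(expand(rules[sym], rules))
        else:
            out.append(sym)
    return out
```

On a document with many identically shaped records, the repeated tag pattern collapses into a few grammar rules, and `expand` recovers the original tag sequence exactly. A real compressor would also have to encode text content and attributes, which this sketch leaves out.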
Similar resources
Updates on Grammar-Compressed XML Data
In this paper, we present updates on CluX, a grammar-based XML compression approach based on clustering XML sub-trees. We show that updates on CluX-compressed data can be performed faster than decompressing the data, loading it into main memory and compressing it. Furthermore, we show how to support fast multiple updates, e.g. performing 100 updates in parallel is more than 70 times faster than...
Inference of node replacement graph grammars
Graph grammars combine the relational aspect of graphs with the iterative and recursive aspects of string grammars, and thus represent an important next step in our ability to discover knowledge from data. In this paper we describe an approach to learning node replacement graph grammars. This approach is based on previous research in frequent isomorphic subgraph discovery. We extend the search...
Efficient Associating Mining Approaches for Compressing Incrementally Updatable Native XML Databases
XML-based applications are widely applied to data exchange in EC and digital archives. However, the study of compressing native XML databases has been surprisingly neglected, especially for huge amounts of data and rapidly updated databases. These two factors give rise to our interest, and motivate us to develop an approach to efficiently compress native XML databases and dynamically maintain...
Schema Extraction from XML Data: A Grammatical Inference Approach
New XML schema languages have been recently proposed to replace Document Type Definitions (DTDs) as the schema mechanism for XML data. These languages consistently combine grammar-based constructions with constraint- and pattern-based ones and have a better expressive power than DTDs. As schemas remain optional for XML data, we address the problem of schema extraction from XML data. We model the XML s...
A Template-Based Approach to Summarize XML Collections
Existing summarization approaches for XML concentrate on extracting common structure and compressing the data, to optimize storage and speed up queries. Neither compression nor structure extraction suffices for advanced, content-based summarization tasks. We present a set of tools for semi-automatic summarization of XML collections, where the user can specify semantically relevant features for...
Publication date: 2005